Delta Compressed and Deduplicated Storage Using Stream-Informed Locality
Abstract
For backup storage, increasing compression allows users to protect more data without increasing their costs or storage footprint. Though removing duplicate regions (deduplication) and traditional compression have become widespread, further compression is attainable. We demonstrate how to efficiently add delta compression to deduplicated storage to compress similar (nonduplicate) regions. A challenge when adding delta compression is the large number of data regions to be indexed. We observed that stream-informed locality is effective for delta compression, so an index for delta compression is unnecessary, and we built the first storage system prototype to combine delta compression and deduplication using this technique. Beyond demonstrating extra compression benefits of 1.4X to 3.5X, we also investigate the throughput and data integrity challenges that arise.
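As a rough illustration of the idea in the abstract, the sketch below stores each chunk in one of three ways: as a duplicate reference, as a delta against a similar base chunk, or as an ordinarily compressed chunk. It is a minimal sketch under stated assumptions, not the prototype described in the paper. The names `fingerprint`, `sketch`, `ChunkStore`, and `sketch_cache` are hypothetical, the similarity feature (maximum CRC-32 over sliding windows) is a stand-in, a plain in-memory dict stands in for whatever locality-driven cache a real system would use, and delta encoding is approximated with zlib's preset-dictionary feature.

```python
import hashlib
import zlib


def fingerprint(chunk: bytes) -> str:
    """Exact-match identity of a chunk, used for deduplication."""
    return hashlib.sha1(chunk).hexdigest()


def sketch(chunk: bytes, window: int = 8) -> int:
    """Hypothetical similarity feature: the maximum CRC-32 over all
    sliding windows. Chunks that share most of their content tend to
    share this value."""
    if len(chunk) < window:
        return zlib.crc32(chunk)
    return max(zlib.crc32(chunk[i:i + window])
               for i in range(len(chunk) - window + 1))


class ChunkStore:
    def __init__(self):
        # fingerprint -> ("raw", data) or ("delta", base_fingerprint, data)
        self.chunks = {}
        # sketch -> fingerprint of a raw base chunk (assumption: a small
        # cache stands in for a full similarity index)
        self.sketch_cache = {}

    def write(self, chunk: bytes):
        fp = fingerprint(chunk)
        if fp in self.chunks:
            return fp, "duplicate"          # deduplicated: store nothing
        sk = sketch(chunk)
        base_fp = self.sketch_cache.get(sk)
        if base_fp is not None:
            # Similar chunk found: encode against the base by using it as
            # a zlib preset dictionary (only the last 32 KiB are used).
            comp = zlib.compressobj(zdict=self.read(base_fp))
            delta = comp.compress(chunk) + comp.flush()
            self.chunks[fp] = ("delta", base_fp, delta)
            return fp, "delta"
        # No similar base: ordinary compression; only raw chunks become
        # bases, so delta chains never exceed one level.
        self.chunks[fp] = ("raw", zlib.compress(chunk))
        self.sketch_cache[sk] = fp
        return fp, "raw"

    def read(self, fp: str) -> bytes:
        entry = self.chunks[fp]
        if entry[0] == "raw":
            return zlib.decompress(entry[1])
        _, base_fp, delta = entry
        dec = zlib.decompressobj(zdict=self.read(base_fp))
        return dec.decompress(delta) + dec.flush()
```

Restricting bases to raw chunks keeps reads to at most two lookups, at the cost of some missed compression; a real system would weigh that trade-off against read throughput.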
Similar resources
Avoiding the Disk Bottleneck in the Data Domain Deduplication File System
Disk-based deduplication storage has emerged as the new-generation storage system for enterprise data protection to replace tape libraries. Deduplication removes redundant data segments to compress data into a highly compact form and makes it economical to store backups on disk instead of tape. A crucial requirement for enterprise data protection is high throughput, typically over 100 MB/sec, w...
DupLESS: Server-Aided Encryption for Deduplicated Storage
Cloud storage service providers such as Dropbox, Mozy, and others perform deduplication to save space by only storing one copy of each file uploaded. Should clients conventionally encrypt their files, however, savings are lost. Message-locked encryption (the most prominent manifestation of which is convergent encryption) resolves this tension. However it is inherently subject to brute-force att...
File System Support for Delta Compression
Delta compression, which consists of compactly encoding one file version as the result of changes to another, can improve efficiency in the use of network and disk resources. Delta compression techniques are readily available and can result in compression factors of five to ten on typical data, however managing delta-compressed storage is difficult. I present a system that attempts to isolate t...
Memory efficient sanitization of a deduplicated storage system
Sanitization is the process of securely erasing sensitive data from a storage system, effectively restoring the system to a state as if the sensitive data had never been stored. Depending on the threat model, sanitization could require erasing all unreferenced blocks. This is particularly challenging in deduplicated storage systems because each piece of data on the physical media could be refer...
Practical Web-based Delta Synchronization for Cloud Storage Services
Delta synchronization (sync) is known to be crucial for network-level efficiency of cloud storage services (e.g., Dropbox). Practical delta sync techniques are, however, only available for PC clients and mobile apps, but not web browsers, the most pervasive and OS-independent access method. To understand obstacles of web-based delta sync, we implemented a traditional delta sync solution (named We...
Publication date: 2012